Review for NeurIPS paper: BERT Loses Patience: Fast and Robust Inference with Early Exit

Neural Information Processing Systems

Summary and Contributions: The authors propose early stopping at test time to improve both inference speed and accuracy. The idea is to train a classifier on top of each layer of a multi-layer embedding model such as BERT and to perform classification one layer at a time, stopping once the prediction stops changing. They demonstrate empirically that the method improves both the speed and the accuracy of BERT/ALBERT on the GLUE benchmark. My opinion of the work remains the same after the author response.

Strengths: A simple, straightforward idea that would be easy to implement directly from the description in the paper, and that in some cases performs better than more complicated methods.
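
To make the procedure in the summary concrete, here is a minimal sketch of layer-by-layer classification that stops once the prediction stops changing. It is an illustration under stated assumptions, not the authors' implementation: the function name `early_exit_predict`, the `patience` parameter (how many consecutive unchanged predictions trigger the exit), and the toy scalar "layers" and "classifiers" are all hypothetical stand-ins for a real encoder and its per-layer classification heads.

```python
from typing import Any, Callable, List, Tuple


def early_exit_predict(
    hidden: Any,
    layers: List[Callable[[Any], Any]],
    classifiers: List[Callable[[Any], int]],
    patience: int = 1,
) -> Tuple[int, int]:
    """Return (predicted_label, layers_used).

    Assumption: `patience` is the number of consecutive layers whose
    prediction must match the previous layer's before we exit early.
    """
    if not layers or len(layers) != len(classifiers):
        raise ValueError("need one classifier per layer, and at least one layer")

    prev_label = None
    streak = 0  # consecutive layers that agreed with the previous prediction
    for depth, (layer, classifier) in enumerate(zip(layers, classifiers), start=1):
        hidden = layer(hidden)      # run one more layer of the model
        label = classifier(hidden)  # classify from this layer's output
        streak = streak + 1 if label == prev_label else 0
        prev_label = label
        if streak >= patience:
            return label, depth     # prediction has stopped changing: exit early

    # No early exit: fall back to the last layer's prediction.
    return prev_label, len(layers)


# Toy stand-ins (hypothetical): each "layer" adds 1 to a scalar state,
# and each "classifier" thresholds that state at 2.
toy_layers = [lambda h: h + 1.0] * 6
toy_classifiers = [lambda h: int(h >= 2.0)] * 6

label, used = early_exit_predict(0.0, toy_layers, toy_classifiers, patience=1)
print(label, used)
```

In this toy run the prediction flips at the second layer and is repeated at the third, so the loop exits after three of the six layers. A larger `patience` trades speed for more agreement before exiting.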